Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Tickbox and Math Protection #39

Merged
merged 34 commits into from
May 11, 2021
Merged

Add Tickbox and Math Protection #39

merged 34 commits into from
May 11, 2021

Conversation

zkamvar
Copy link
Member

@zkamvar zkamvar commented May 1, 2021

Description

This fixes TWO issues with one solution!

What is the solution?

  • added the @asis attribute in the xml definition that allows text to be passed from xml to markdown without being escaped
  • added find_between(): a function that searches between patterns (used to find nodes between both $$ and $).
  • added md_ns(): a named alias for the commonmark namespace
  • added protect_math(): a function for protecting math elements (not default)
  • added tickbox protection for GitHub checkboxes/tasklists
  • added add_node_siblings(), an internal function that will take a nodelist and add them as siblings following a given node in the document with the option to have the siblings replace the node in question.
  • added procedure for splitting text nodes, recombining them, and adding them to the document.

Related Issue

This addresses #38 by allowing dollar-notation LaTeX math to be protected (though bracket notation is still needed). It also fixes #10 and supersedes #20.

Example

library("tinkr")
m <- yarn$new(system.file("extdata", "math-example.md", package = "tinkr"))
m$head(15) # math formatting broken (tickboxes in tact!)
#> ---
#> title: An example with math elements
#> ---
#> 
#> This is cheap, it only costs 10$!
#> 
#> This example has $\\LaTeX$ elements embedded in the
#> text. It is intended to demonstrate that m $\\alpha\_\\tau$ h
#> mode can work with tinkr. $y =
#> mx + b$
#> 
#> - [ ] This is an empty checkbox
#> - [x] This is a checked checkbox
#> - [This is a link](https://ropensci.org)
#> - \[this is an example\]
m$protect_math()$head(15) # formatting rescued!
#> ---
#> title: An example with math elements
#> ---
#> 
#> This is cheap, it only costs 10$!
#> 
#> This example has $\LaTeX$ elements embedded in the
#> text. It is intended to demonstrate that m $\alpha_\tau$ h
#> mode can work with tinkr. $y =
#> mx + b$
#> 
#> - [ ] This is an empty checkbox
#> - [x] This is a checked checkbox
#> - [This is a link](https://ropensci.org)
#> - \[this is an example\]
m$tail() # block math protected!
#> 
#> $$
#> Q_{N(norm)}=\frac{C_N +C_{N-1}}2\times
#> \frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}
#> $$

Created on 2021-05-07 by the reprex package (v2.0.0)

Created on 2021-05-06 by the reprex package (v2.0.0)

This is definitely still a work in progress!
zkamvar added 16 commits May 1, 2021 18:38
commonmark wraps text in <emph></emph> tags. If I add `asis` to
them, they disappear... which probably means that I just need to add
the `asis` tag to the text nodes :face_palm:
This is why when you run into odd things during development, to
restart and work with a clean environment, because sometimes you
screw up.
This allows us to include the delimiters for the search
Deer lord, I hope this makes sense six months down the line
@zkamvar zkamvar marked this pull request as ready for review May 6, 2021 00:43
@zkamvar zkamvar marked this pull request as draft May 6, 2021 00:47
@zkamvar zkamvar requested a review from maelle May 6, 2021 15:55
I had modified `find_between()` from pegboard:::find_between_tags()
(https://github.com/carpentries/pegboard/blob/378c627b4e08869540aa049b678f9c4a560dff59/R/div.R#L125-L159)

My initial modification to extract all of the descendants was in the
context of block math where I knew that I just needed to add the 'asis'
attribute to the nodes. In the context of inline math, where I needed to
create new nodes, this was not feasible since extracting
descendant-or-self would give me duplicate nodes, which would hinder the
process of creating new nodes and lead to stuttering of the output.

Separating out the search functionality into `find_between()` really
helped me fix those two inconsistencies
@zkamvar zkamvar marked this pull request as ready for review May 6, 2021 19:12
@zkamvar zkamvar changed the title add asis node; demonstrate math protection Add Tickbox and Math Protection May 7, 2021
Copy link
Member

@maelle maelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is brilliant! 👏 Thank you!

Apart from my small comments/questions, would it make sense to add a paragraph about Math in the README?
Also, when would one not want to use protect_math()?

NEWS.md Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
R/add_md.R Outdated Show resolved Hide resolved
R/add_md.R Show resolved Hide resolved
R/add_md.R Outdated Show resolved Hide resolved
R/add_md.R Outdated Show resolved Hide resolved
R/add_md.R Show resolved Hide resolved
R/add_md.R Show resolved Hide resolved
R/asis-nodes.R Outdated Show resolved Hide resolved
R/asis-nodes.R Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
@zkamvar
Copy link
Member Author

zkamvar commented May 7, 2021

Apart from my small comments/questions, would it make sense to add a paragraph about Math in the README?

Yes, it would!

Also, when would one not want to use protect_math()?

I can really only think of edge cases (e.g. writing a BASIC tutorial or describing the Burroughs-Wheeler Transform with ^ and $ as the delimiters.)

README.md Outdated
Comment on lines 245 to 318
### LaTeX equations

While markdown parsers like pandoc know what LaTeX is, commonmark does
not, and that means LaTeX equations will end up with extra markup due to
commonmark’s desire to escape characters.

For users of the `yarn` object, if you have LaTeX equations that use
either `$` or `$$` to delimit them, you can protect them from formatting
changes with the `$protect_math()` method:

``` r
path <- system.file("extdata", "math-example.md", package = "tinkr")
math <- tinkr::yarn$new(path)
math$tail() # malformed
#|
#| $$
#| Q\_{N(norm)}=\\frac{C\_N +C\_{N-1}}2\\times
#| \\frac{\\sum *{i=N-n}^{N}Q\_i} {\\sum*{j=N-n}^{N}{(\\frac{C\_j+C\_{j-1}}2)}}
#| $$
math$protect_math()$tail() # success!
#|
#| $$
#| Q_{N(norm)}=\frac{C_N +C_{N-1}}2\times
#| \frac{\sum _{i=N-n}^{N}Q_i} {\sum_{j=N-n}^{N}{(\frac{C_j+C_{j-1}}2)}}
#| $$
```

Note, however, that there are a few caveats for this:

1. The dollar notation for inline math must be adjacent to the text.
E.G. `$\alpha$` is valid, but `$ \alpha$` and `$\alpha $` are not
valid.

2. We do not currently have support for bracket notation

3. If you use a postfix dollar sign in your prose (e.g. BASIC commands
or a Burroughs-Wheeler Transformation demonstration), you must be
sure to either use punctuation after the trailing dollar sign OR
format the text as code. (i.e. `` `INKEY$` `` is good, but `INKEY$`
by itself is not good and will be interepreted as LaTeX code,
throwing an error:

``` r
path <- system.file("extdata", "basic-math.md", package = "tinkr")
math <- tinkr::yarn$new(path)
math$head(15) # malformed
#| ---
#| title: basic math
#| ---
#|
#| BASIC programming can make things weird:
#|
#| - Give you $2 to tell me what INKEY$ means.
#| - Give you $2 to *show* me what INKEY$ means.
#| - Give you $2 to *show* me what `INKEY$` means.
#|
#| Postfix dollars mixed with prefixed dollars can make things weird:
#|
#| - We write $2 but say 2$ verbally.
#| - We write $2 but *say* 2$ verbally.
math$protect_math() #error
#| Error: Inline math delimiters are not balanced.
#|
#| HINT: If you are writing BASIC code, make sure you wrap variable
#| names and code in backtics like so: `INKEY$`.
#|
#| Below are the pairs that were found:
#| start...end
#| -----...---
#| Give you $2 to ... me what INKEY$ means.
#| Give you $2 to ... 2$ verbally.
#| We write $2 but ...
```

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maelle, let me know if this looks okay for you for the README

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does! I just wonder what folks who do not use the yarn object should do: probably update their code to use the yarn object?

Copy link
Member Author

@zkamvar zkamvar May 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason why I am using the yarn object here is because we can use our namespace object in the internal parsing, but I can export that object so that we can use it as default.

structure(c(md = "http://commonmark.org/xml/1.0"), class = "xml_namespace")

@zkamvar zkamvar requested a review from maelle May 7, 2021 21:55
Copy link
Member

@maelle maelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the updates!! I only added a small comment.

@zkamvar
Copy link
Member Author

zkamvar commented May 11, 2021

Excellent! I realized I need to add a couple of more touches to make it complete (e.g. clearing the namespace of the new nodes before copying to avoid the issue of duplicating namespace).

Thank you so much for reviewing this!

 - add xml_ns_strip() before node insertion events to avoid duplicate
   namespaces from accumulating (see
   https://community.rstudio.com/t/adding-nodes-in-xml2-how-to-avoid-duplicate-default-namespaces/84870)
 - export md_ns() and protect_math() for users of to_xml()
 - protect_tickbox() is now default in to_xml()
 - updated README
 - updated asis tests
@zkamvar zkamvar merged commit bd76a24 into master May 11, 2021
@zkamvar zkamvar deleted the wear-your-mathk branch May 11, 2021 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Check that task lists aren't lost by a loop
2 participants